Biomolecule Sequencer:
Next-Generation DNA Sequencing Technology for In-flight
Environmental Monitoring, Research, and Beyond


David J. Smith, Ph.D. 
NASA Ames Research Center 


October 28, 2016 




www.nasa.gov 




April 19: The first molecular biology assay in space is completed, as DNA is
amplified using the miniPCR™ thermal cycler


April 29: RNA isolation, reverse transcription, and DNA amplification are
performed on the ISS using the Wetlab-2 qPCR platform
Biomolecule Sequencer Payload

• First attempt at DNA sequencing in the microgravity environment of space
• Enabled by the MinION™, developed by Oxford Nanopore Technologies
• COTS miniature DNA sequencer
• Less than 120 grams (with USB cable)
• Powered via USB connection
• Capable of DNA, RNA, and protein sequencing

MinION™ by Oxford Nanopore Technologies
~1 year to certify a payload for flight
Class 1E is a streamlined certification process
Reduce the time it takes to get scientific payloads to the ISS and increase
utilization as a National Laboratory


Authority to proceed February, 2015 
Hardware delivered on December 18, 2015 
Launched July 18, 2016 (SpaceX CRS-9) 


Technology Demonstration operations occurred on 
August 26, Sept. 3 and Sept. 7, 2016 
Biomolecule Sequencer: The Team

Payload Development
• Aaron Burton, Ph.D. (PI), NASA JSC
• Sarah Castro-Wallace, Ph.D. (PM), NASA JSC
• Kristen John, Ph.D. (Deputy PM and PE), NASA JSC
• Sarah Stahl, M.S. (PS), NASA JSC

Astronaut
• Kate Rubins, Ph.D., NASA JSC

External Science Team
• Charles Chiu, Ph.D. (UCSF)
• Scot Federman
• Sneha Somasekar
• Doug Stryke
• Guixia Yu
• Chris Mason, Ph.D. (WCMC)
• Noah Alexander
• Alexa McIntyre
Why do we need a DNA sequencer to support
the human exploration of space?
• Operational environmental monitoring
• Identification of contaminating microbes
• Infectious disease diagnosis
• Research
• Human
• Animal
• Microbes/Cell lines
• Plant
• Med Ops
• Response to countermeasures
• Radiation
• Functional testing for integration into robotics for Mars exploration missions
Biomolecule Sequencer: The Benefits

• Benefits to In-flight Sequencing
• Sequencing on the ISS can inform real-time decisions (remediation strategies,
research, med ops, etc.)
• Unlike other technologies, sequencing is not limited to the detection of
specific targets, but rather will provide data on the entirety of a sample
• Reduce down mass (sample return for environmental monitoring, crew health, etc.)
• Real-time analysis can influence medical intervention
• Support astrobiology science investigations
• Technology well suited to in situ nucleic acid-based life detection
Biomolecule Sequencer: Nanopore Sequencing

Flow Cell: Contains the nanopore sensing technology that is required to
perform the sequencing reaction

Nanopore-based sequencers measure changes in current caused by DNA strands
migrating through the pore. The changes in current are characteristic of the
sequence of the migrating DNA.
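
The idea that current changes are characteristic of the sequence can be illustrated with a toy model. The sketch below is not Oxford Nanopore's pore model or basecaller: it assumes a hypothetical lookup table in which every 2-base k-mer has its own current level (evenly spaced, made-up pA values), turns a sequence into an expected trace, and then calls bases by matching each measured level to the nearest table entry. The names `LEVELS`, `expected_trace`, and `call_bases` are illustrative only.

```python
from itertools import product

K = 2  # toy k-mer size; chosen for brevity, not taken from the slides

# Hypothetical current levels in pA, evenly spaced 4 pA apart.
# These are made-up values, not a real pore model.
KMERS = ["".join(p) for p in product("ACGT", repeat=K)]
LEVELS = {kmer: 60.0 + 4.0 * i for i, kmer in enumerate(KMERS)}


def expected_trace(seq):
    """Return the expected current level for each k-mer window of seq."""
    return [LEVELS[seq[i:i + K]] for i in range(len(seq) - K + 1)]


def call_bases(trace):
    """Map each measured level to the nearest k-mer and stitch the bases."""
    kmers = [
        min(LEVELS, key=lambda kmer: abs(LEVELS[kmer] - level))
        for level in trace
    ]
    if not kmers:
        return ""
    # Consecutive k-mers overlap by K-1 bases, so keep only each last base.
    return kmers[0] + "".join(kmer[-1] for kmer in kmers[1:])


if __name__ == "__main__":
    trace = expected_trace("GATTACA")
    print(trace)
    print(call_bases(trace))
```

Because the toy levels are 4 pA apart, the nearest-level lookup tolerates offsets smaller than 2 pA. Real signals are noisier and real k-mer levels overlap, which is why production basecallers use statistical models rather than a simple lookup.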
Biomolecule Sequencer: Comparison to other Sequencers

Manufacturer | Model | Sequencing methodology | Mass-based cost to transport hardware to the ISS | Power requirements
Illumina | MiSeq | Fluorescence: each nucleotide has a different fluorophore | $1,200,000 | 100 - 240V AC, 10A, 400 W
Illumina | MiniSeq | Fluorescence: each nucleotide has a different fluorophore | $990,000 | 100 - 240V AC, 15A
PacBio | Sequel | Fluorescence: each nucleotide has a different fluorophore | [illegible] | 208 - 240V AC, 30A
IonTorrent | Ion PGM | Electrochemical: measures voltage changes caused by nucleotide addition | $650,000 | 100 - 240V AC, 9A, 200 - 300 W
Oxford Nanopore Technologies | MinION | Electrochemical: measures DNA-mediated changes in current passing through nanopores | [illegible] | USB3 (5V, < 1A), 1 W

Special requirements: Pressurized N2 (50 PSI); Pressurized N2 (35 - [illegible] PSI)


Project Goals:

1. Test the basic functionality by comparing ISS sequencing results of
pre-determined samples to ground results

2. Evaluate crew operability and potential for degrees of autonomy
Biomolecule Sequencer: DNA Samples

Experimental Design:

Sequence a ground-prepared sample containing a mixture of genomic DNA from:

• Bacteriophage lambda
• Escherichia coli
• Mouse: BALB/c (female)
1. Launch packaged items
3. Remove flow cell and sample syringe from cold stowage and allow to
equilibrate to ambient ISS temperature
4. Destow and connect MinION to Surface Pro3
5. Sample injection
6. Dispose of sample syringe
7. Initiate the sequencing experiment
8. Data collection
9. Stow used flow cell for return
10. Data downlink
11. Stow MinION, Surface Pro3, power & [illegible]
12. Return of payload


Biomolecule Sequencer: ISS Test Demo 


• 4 batches of libraries were prepared containing the genomes to be sequenced
• From the 4 batches, 18 samples were produced: 9 for ISS and 9 identical
ground controls
• Aliquots of all libraries were sequenced for quality control
• Synchronous ground controls were performed

• 3 sequencing experiments have been conducted to date (Aug. 26, Sept. 3,
Sept. 7, 2016); additional runs are planned
Did it work?

bioRxiv: The Preprint Server for Biology
Nanopore DNA Sequencing and Genome Assembly on the International Space Station
Posted September 27, 2016

The emergence of nanopore-based sequencers greatly expands the reach of sequencing
into low-resource field environments, enabling in situ molecular analysis. In this work, we
evaluated the performance of the MinION DNA sequencer (Oxford Nanopore
Technologies) in-flight on the International Space Station (ISS), and benchmarked its
performance off-Earth against the MinION, Illumina MiSeq, and PacBio RS II sequencing
platforms in terrestrial laboratories. The samples contained mixtures of genomic DNA
extracted from lambda bacteriophage, Escherichia coli (strain K12) and Mus musculus
(BALB/c). The in-flight sequencing experiments generated more than 80,000 total reads
with mean 2D accuracies of 85 to 90%, mean 1D accuracies of 75 to 80%, and median
read lengths of approximately 6,000 bases. We were able to make directed assemblies of
the ~4.7 Mb E. coli genome, ~48.5 kb lambda genome, and a representative M. musculus
sequence (the ~16.3 kb mitochondrial genome), at 100%, 100%, and 96.7% pairwise
identity, and de novo assemblies of the lambda and E. coli genomes solely with nanopore
reads yielded 100% and 99.8% genome coverage, respectively, at 100% and 98.5%
pairwise identity.

Across all surveyed metrics (base quality, throughput, stays/base, skips/base), no
observable decrease in MinION performance was observed while sequencing DNA in
space. Simulated runs of in-flight nanopore data using an automated bioinformatic
pipeline demonstrated the feasibility of real-time sequencing analysis and metagenomic
identification of microbes in space. Additionally, cloud- and laptop-based assembly
illustrated the plausibility of automated, de novo genomic assembly from nanopore data
on the ISS. Applications of sequencing for space exploration include infectious disease
diagnosis, environmental monitoring, evaluating biological responses to spaceflight, and
even potentially the detection of extraterrestrial life on other planetary bodies.
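
As a rough illustration of the "pairwise identity" figures quoted above, the sketch below computes percent identity from two sequences that have already been aligned (gaps written as `-`). It assumes one common definition, matching columns divided by all alignment columns. The preprint's own tools may count gaps or alignment length differently, and the function name `pairwise_identity` is hypothetical.

```python
def pairwise_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences carry the same base.

    Assumes the inputs are already aligned to equal length. Columns where
    both sequences have a gap are ignored; a gap opposite a base counts as
    a mismatch. Other definitions of identity exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a, aligned_b)
        if not (a == gap and b == gap)
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Six columns, four of them identical (one mismatch, one gap).
    print(round(pairwise_identity("ACGT-A", "ACCTGA"), 2))
```

Under this definition a read with frequent insertions or deletions scores lower even when its substituted bases are rare, which matters for nanopore data.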
Biomolecule Sequencer: Results

(A) A mixture of equimolar DNA from mouse, E. coli and lambda phage genomes was
sequenced in parallel on Earth ("Ground") and in-flight on the ISS (after being
delivered by a SpaceX Dragon capsule). Synchronous nanopore sequencing runs were
performed from August 26 to September 13, 2016.

Panel labels: ISS1, ISS2, ISS3, ISS4 (8/26, 9/3, 9/7, 9/13); Equimolar DNA

(B) Plot of mean current intensity in picoamperes (pA; y-axis) against k-mers
(x-axis) in order of increasing mean current based on a model distribution from Oxford
Nanopore Technologies (black). Current distributions are tightly clustered with the
exception of lower-quality ground #2.
(C) Density plots showing pairwise identity of nanopore reads collected on the ISS
(left panel) and on the ground (right panel) relative to lambda (left box), E. coli
(middle box) and mouse (right box) genomes. Abbreviations: 2D, high-quality
two-dimensional reads; template, 1D template read; complement, complement read.

(D) Pie charts of the read distributions corresponding to each ISS run and pooled ISS
runs 1 to 4, in comparison to that obtained from a ground Illumina MiSeq run of the
same sample mixture.
(A) Flow chart of the SURPIrt bioinformatics pipeline for real-time
microbial detection from nanopore data.

(B) Donut charts of read distributions corresponding to all reads (left),
viruses (upper right), and bacteria (lower right) from ISS run 1. These
charts were generated dynamically as part of a real-time sequencing analysis
simulation using SURPIrt.
Biomolecule Sequencer:
Automated Analysis Pipeline

Pipeline stages: nanopore sequencing; data transfer and basecalling;
continuous directory scanning; SURPIrt (computational host subtraction,
microbial identification); real-time graphical visualization (SURPIviz)

ISS run #1 (SURPIrt, unclassified)

All reads (14,859)
• Bacteria (6,279, 42.3%)
• Mus musculus (6,220, 41.8%)
• Viruses (520, 3.5%)
• Non-host eukaryote (0.40%)
• Other (1.9%)
• Low quality / complexity (14)

Viral Species
• Enterobacteria phage lambda (480, 92.3%)
• Enterobacteria phage HK629 (22)
• Enterobacteria phage HK630 (10)
• Enterobacteria phage 933W sensu lato (1)
• Shigella phage 75/02 Stx (2)
• Enterobacteria phage mEpX2 (1)
• uncultured virus (1)
• Enterobacteria phage Sf101 (1)
• Stx-2 converting phage Stx2a_WGPS9 (1)
• Stx-2 converting phage Stx2a_F4S1 (1)

Bacterial Species
• Escherichia coli (5,963, 95.0%)
• uncultured bacterium (38)
• Shigella flexneri (129)
• Shigella sonnei (64)
• Shigella boydii (48)
• Shigella dysenteriae (13)
• uncultured bacterium Contig125 (1)
• Salmonella enterica (6)
• Raoultella ornithinolytica (2)
• Citrobacter amalonaticus (3)
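
The "continuous directory scanning" stage of the pipeline can be sketched as a simple polling loop that hands each newly written read file to a downstream step. This is not SURPIrt's implementation. It is a minimal illustration that assumes reads arrive as individual files with a `.fast5` suffix and that a caller-supplied `handle_read` function (a hypothetical name) performs the next stage.

```python
import os
import time


def scan_new_files(directory, seen, suffix=".fast5"):
    """Return paths of matching files that earlier scans have not reported.

    `seen` is a set of file names that this function updates in place.
    The ".fast5" suffix is an assumption about how reads are stored.
    """
    new_paths = []
    for name in sorted(os.listdir(directory)):
        if name.endswith(suffix) and name not in seen:
            seen.add(name)
            new_paths.append(os.path.join(directory, name))
    return new_paths


def watch(directory, handle_read, poll_seconds=5.0, max_polls=None):
    """Poll `directory` and pass each new read file to `handle_read`.

    Runs until `max_polls` scans have completed, or indefinitely when
    `max_polls` is None. Returns the set of file names that were handled.
    """
    seen = set()
    polls = 0
    while max_polls is None or polls < max_polls:
        for path in scan_new_files(directory, seen):
            handle_read(path)
        polls += 1
        if max_polls is None or polls < max_polls:
            time.sleep(poll_seconds)
    return seen


if __name__ == "__main__":
    # Scan the current directory once and print any read files found.
    watch(".", print, poll_seconds=0.0, max_polls=1)
```

A production version would also need to confirm that a file has finished being written before processing it. Polling is used here only because it needs nothing beyond the standard library.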
(C) Stacked distributions of reads from ISS runs 1 through
4 aligning to mouse, E. coli, and lambda. Subgroup 1 shows
raw SURPIrt output in the absence of taxonomic
classification, while subgroups 2 and 3 show the effects of
classification using the GenBank NT database and separate
viral or bacterial databases, respectively. The relative
proportions of read counts from SURPIrt differ from those
obtained by GraphMap alignment to the most closely
matched reference genome in NCBI NT (subgroup 4).


(D) Coverage (green) and pairwise identity plots (purple) of
raw nanopore reads mapped to the E. coli (upper panel), the
mouse mitochondrial (lower left panel), and lambda
genomes (lower right panel). Reads are mapped to the most
closely matched reference genome identified by SURPIrt.


Biomolecule Sequencer:
Automated Analysis Pipeline

Legend: ISS run #1, ISS run #2, ISS run #3, ISS run #4
Each subgroup plots mouse, E. coli, and lambda:
Subgroup 1 (SURPIrt, unclassified)
Subgroup 2 (SURPIrt, all NT classification)
Subgroup 3 (SURPIrt, viral / bacterial NT classification)
Subgroup 4 (GraphMap alignment to top matching reference genome; all hits)

E. coli K-12 MG1655 (gi|999847124
n=41,315 reads, 36.8X avg depth, 99.8% complete, 99.96% identity

Mus musculus mitochondrial genome (gi|13838|, 16,295 bp)
n=29 reads, 5.2X avg depth, 100% complete, 96.72% identity

Enterobacteria phage lambda (gi|215104|, 48,502 bp)
n=17,287 reads, 1,486X avg depth, 100% complete, 100% identity
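
The "avg depth" and "% complete" figures reported for each reference genome can be understood through two standard coverage calculations. The sketch below is an illustration under simple assumptions, not the method used to produce the slide: depth is total aligned bases divided by genome length, and completeness is the fraction of reference positions covered by at least one read. Intervals are assumed to be 0-based and half-open, and the function names are hypothetical.

```python
def average_depth(aligned_lengths, genome_length):
    """Mean depth: total aligned bases divided by the reference length."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    return sum(aligned_lengths) / genome_length


def fraction_covered(intervals, genome_length):
    """Fraction of reference positions covered by at least one interval.

    `intervals` holds (start, end) pairs, 0-based and half-open. Overlapping
    intervals are counted once.
    """
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    covered = 0
    current_end = 0
    for start, end in sorted(intervals):
        start = max(start, current_end)  # skip positions already counted
        end = min(end, genome_length)    # clip to the reference
        if end > start:
            covered += end - start
            current_end = end
    return covered / genome_length


if __name__ == "__main__":
    # Three hypothetical alignments on a 3,000 bp reference.
    print(average_depth([1000, 2000, 3000], 3000))
    print(fraction_covered([(0, 1000), (500, 2000)], 3000))
```

Read against the slide, a 1,486X average depth over the 48,502 bp lambda genome corresponds to roughly 72 Mb of aligned bases. Spread over 17,287 reads, that implies a mean aligned length of about 4.2 kb per read, if depth was computed this way.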


National Aeronautics and Space Administration


Looking Ahead: Need for Sample Preparation 


Swab to Sequencer Sample Preparation Process 


DNA Extraction → Amplification → Library Preparation → Sequencing
Biomolecule Sequencer: The Future

• Maintain the Biomolecule Sequencer on the ISS as a permanent operational and
research facility.
• Continue initiatives to develop the capabilities to perform the sample
collection and preparation on orbit, allowing an endless number of potential
experiments.
• Nanopore sequencing can go far beyond DNA and can enable methylation,
epigenetics, RNA, modified bases, and protein studies.
Biomolecule Sequencer: The Team

• Sarah Castro-Wallace, Ph.D. (PM), NASA JSC
• Kristen John, Ph.D. (Deputy PM and PE), NASA JSC
• Sarah Stahl, M.S. (PS), NASA JSC
• Mark Lupisella, Ph.D., NASA GSFC
• David Smith, Ph.D., NASA ARC
• Tim Stephenson, Ph.D., NASA GSFC
• Doug Botkin, Ph.D.
• Sneha Somasekar, UCSF
• Doug Stryke, UCSF
• Guixia Yu, UCSF
• Chris Mason, Ph.D., WCMC
• Noah Alexander, WCMC
• Alexa McIntyre, WCMC
• Sissel Juul, Ph.D.
• Daniel Turner, Ph.D.
• Michael Micorescu, Ph.D.

Astronaut
• Kate Rubins, Ph.D.
Special thanks to Marc Reagan for testing during NEEMO 21 engineering week
and Drs. Lindsay Rizzardi and Andy Feinberg for parabolic flight testing!
Biomolecule Sequencer: Questions?


Thank you!


David.J.Smith-3@nasa.gov